AMENDMENTS TO THE CLAIMS:
1-71. (Canceled) 

72. (Currently Amended) A method for identifying similar biopolymers [[biopolymer
sequences characterized by a topological pattern of match states]], the method
comprising the steps of:

constructing a statistical model comprising a hidden Markov Model of a set of
known sequences that correspond to defined regions of a set of biopolymer sequences
to provide [[characterized by]] a characteristic topological pattern of match states
between the biopolymer sequences, each match state characterized by a scoring matrix,
wherein the scoring matrix for a first match state defines a state of similarity for a
conserved region of the biopolymer sequences and the scoring matrix for a second match
state defines a state of dissimilarity for a divergent region of the biopolymer
sequences, [[;]] the model comprising one or more modules of nodes; [[,]]

comparing the set of biopolymer sequences to the statistical model by evaluating
[[the topological pattern of match states of the biopolymer sequences to the
topological pattern of match states of the set of known sequences and comparing]] the
scoring matrices of the match states to provide an output score [[determining the state
of similarity or the state of dissimilarity of the biopolymer sequences and the set of
known sequences]]; and

[[identifying the biopolymer sequences]] determining a likelihood that the set of
biopolymer sequences is represented by the model and thereby similar biopolymers
based on the score [[state of similarity or the state of dissimilarity with the set of
known sequences]].
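
For illustration only, the step of claim 72 that determines a likelihood that sequences are represented by a hidden Markov Model can be sketched with the standard forward algorithm. The claim does not name an algorithm, so the forward algorithm, the two state names, the alignment-column symbols ("=" for identical monomers, "x" for differing monomers), and all probabilities below are assumptions chosen for the sketch.

```python
# Hypothetical two-state model: one state for a conserved region, one for a
# divergent region. All numbers are illustrative, not taken from the claims.
STATES = ("conserved", "divergent")
START = {"conserved": 0.5, "divergent": 0.5}
TRANS = {
    "conserved": {"conserved": 0.9, "divergent": 0.1},
    "divergent": {"conserved": 0.1, "divergent": 0.9},
}
# Emission of an alignment column: "=" identical monomers, "x" differing monomers.
EMIT = {
    "conserved": {"=": 0.9, "x": 0.1},
    "divergent": {"=": 0.25, "x": 0.75},
}


def forward_likelihood(observations, states, start, trans, emit):
    """Return P(observations | model) using the forward algorithm."""
    # alpha[s]: probability of the columns seen so far, ending in state s.
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emit[s][symbol] * sum(alpha[r] * trans[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


likelihood = forward_likelihood("===xxx", STATES, START, TRANS, EMIT)
```

In this sketch a higher `likelihood` would indicate that the column pattern is better explained by the model; any threshold for calling sequences similar would be a further assumption.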

73. (Previously Presented) The method of claim 72, wherein the step of 
constructing the model comprises: 

determining the topological pattern of match states of the set of known 
sequences; 
preparing at least one module of nodes for each match state; and
linking the modules of nodes to form the model.

74. (Previously Presented) The method of claim 73, wherein the step of 
preparing the modules of nodes comprises: 

programming the modules of nodes against a training set of data objects 
characteristic of the topology pattern of match states of the set of known sequences; 
and 

tuning the nodes in an iterative process until the modules encompass the training 
set of data objects. 

75. (Previously Presented) The method of claim 74, wherein the step of 
programming the modules of nodes comprises defining the scoring matrix for each 
match state. 

76. (Canceled) 

77. (Previously Presented) The method of claim 75, wherein the scoring matrix 
defining a state of dissimilarity is a function of the scoring matrix defining a state of 
similarity. 

78. (Previously Presented) The method of claim 75, wherein the scoring matrix 
defining a state of dissimilarity is a function of the arithmetic inverse of the scoring 
matrix defining a state of similarity. 
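
For illustration only, the relationship recited in claim 78 can be sketched as follows, reading "arithmetic inverse" as the additive inverse (negation) of each entry. That reading, the dictionary representation, and the toy scores are assumptions made for the sketch.

```python
def dissimilarity_matrix(similarity):
    """Negate every entry of a pairwise scoring matrix keyed by monomer pairs."""
    return {pair: -score for pair, score in similarity.items()}


# Hypothetical similarity scores for a two-letter alphabet.
similarity = {
    ("A", "A"): 2.0,
    ("A", "G"): -1.0,
    ("G", "A"): -1.0,
    ("G", "G"): 2.0,
}
dissimilarity = dissimilarity_matrix(similarity)
```

Under this reading, pairs that score well in a conserved region are penalized in a divergent region, and mismatched pairs are rewarded.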

79. (Canceled)

80. (Previously Presented) The method of claim 72, wherein the set of known 
sequences consists of two sequences. 

81. (Previously Presented) The method of claim 72, wherein the set of known
sequences comprises at least three sequences. 

82. (Previously Presented) The method of claim 72, wherein the set of known 
sequences comprises amino acid sequences. 

83. (Previously Presented) The method of claim 72, wherein the set of known 
sequences comprises nucleic acid sequences. 

84. (Previously Presented) The method of claim 72, wherein one or more nodes 
represent an insertion at a first position in the set of known sequences. 

85. (Previously Presented) The method of claim 72, wherein one or more nodes 
represent a deletion at a second position in the set of known sequences. 

86. (Previously Presented) The method of claim 72, wherein each node 
represents a distribution of monomers at defined positions in the set of known 
sequences. 

87. (Previously Presented) The method of claim 86, wherein the distribution of 
monomers at a first node is different from the distribution of monomers at a second 
node. 

88. (Previously Presented) The method of claim 87, wherein the distribution of 
monomers is a function of a scoring matrix that relates the distribution of monomers at a 
first node and a scoring matrix that relates the distribution of monomers at a second 
node. 

89. (Previously Presented) The method of claim 88, wherein the scoring matrix 
is a function of independent probabilities of a monomer occurrence. 

90. (Currently Amended) The method of claim 89, wherein the distribution P(a,b)
of monomers a and b, a scoring matrix S(a,b), and independent probabilities of 
monomers, Q(a) and Q(b) are related such that S(a,b) = log(P(a,b) / (Q(a) • Q(b))). 
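
For illustration only, the relation S(a,b) = log(P(a,b) / (Q(a) • Q(b))) recited in claim 90 can be computed as sketched below. The claim does not state the base of the logarithm, so the natural logarithm is an assumption, as are the function name and the toy probabilities.

```python
import math


def log_odds_matrix(joint, background):
    """Compute S(a,b) = log(P(a,b) / (Q(a) * Q(b))) for every monomer pair."""
    return {
        (a, b): math.log(p_ab / (background[a] * background[b]))
        for (a, b), p_ab in joint.items()
    }


# Hypothetical joint distribution P(a,b) and independent probabilities Q(a).
joint = {("A", "A"): 0.4, ("A", "G"): 0.1, ("G", "A"): 0.1, ("G", "G"): 0.4}
background = {"A": 0.5, "G": 0.5}
scores = log_odds_matrix(joint, background)
```

A positive score means the pair is observed more often than the independent probabilities would predict; a negative score means it is observed less often.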

91. (Previously Presented) The method of claim 72, wherein the model
comprises a first module which characterizes the match state between the set of known 
sequences in a first region and a second module which characterizes the match state 
between the set of known sequences in a second region; wherein the match states of 
the first and second module are different. 

92. (Previously Presented) The method of claim 91, wherein the model further
comprises a third module that characterizes the match state between the set of known 
sequences in a third region. 

93. (Previously Presented) The method of claim 92, wherein the third module 
is positioned between the first and second module with respect to the order of the set of 
known sequences. 

94. (Previously Presented) The method of claim 93, wherein the third module 
indicates similarity between a third region of each set of known sequence, and a 
sequence profile characterized by altered scoring matrices. 

95. (Previously Presented) The method of claim 94, wherein the sequence 
profile comprises a profile of a modification site. 

96. (Previously Presented) The method of claim 95, wherein the modification site 
is a processing site. 

97. (Previously Presented) The method of claim 96, wherein the processing site
indicates a preference for at least one basic residue. 

98. (Previously Presented) The method of claim 96, wherein the processing site 
indicates a preference for at least two basic residues. 

99. (Previously Presented) The method of claim 96, wherein the processing site 
comprises a convertase processing site. 

100. (Previously Presented) The method of claim 96, wherein the processing 
site comprises a secretase processing site. 

101. (Previously Presented) The method of claim 72, wherein the biopolymer 
sequences comprise sequences from different species. 

102. (Previously Presented) The method of claim 101, wherein the different 
species comprise mammalian species. 

103. (Previously Presented) The method of claim 72, wherein the set of known 
sequences comprise genomic nucleic acid sequences. 

104. (Previously Presented) The method of claim 72, wherein the set of known 
sequences comprises non-coding regions. 

105. (Previously Presented) The method of claim 72, wherein the set of known 
sequences comprises regulatory regions. 

106. (Previously Presented) The method of claim 72, wherein the set of known 
sequences comprises transcriptional regulatory regions. 

107. (Currently Amended) A method for identifying similar biopolymer sequences
[[characterized by a topological pattern of match states]], the method comprising the
steps of:

constructing a statistical model comprising a hidden Markov Model of a set of
known sequences that correspond to defined regions of a set of biopolymer sequences
to provide [[characterized by]] a characteristic topological pattern of match states
between the biopolymer sequences, each match state characterized by a scoring matrix,
wherein the scoring matrix for a first match state defines a state of similarity for a
conserved region of the biopolymer sequences and the scoring matrix for a second match
state defines a state of dissimilarity for a divergent region of the biopolymer
sequences, the model comprising one or more modules of nodes, each module of nodes
representing a different match state, wherein the step of constructing the model
comprises programming the modules of nodes against a training set of data objects
characteristic of the match state of the set of known sequences, tuning the nodes in an
iterative process until the modules encompass the training set of data objects, and
linking the modules to form the model;

comparing the set of biopolymer sequences to the statistical model by evaluating
[[the topological pattern of match states of the biopolymer sequences to the
topological pattern of match states of the set of known sequences and comparing]] the
scoring matrices for each match state to provide an output score [[determine the state
of similarity or the state of dissimilarity of the biopolymer sequences and the set of
known sequences]]; and

[[identifying the biopolymer sequences]] determining a likelihood that the set of
biopolymer sequences is represented by the model and thereby similar biopolymers
based on the score [[state of similarity or the state of dissimilarity with the set of
known sequences]].

108. (Previously Presented) The method of claim 107, wherein the step of 
programming the modules of nodes further comprises the step of defining a scoring 
matrix for each match state. 

109-110. (Canceled)

111. (Previously Presented) The method of claim 108, wherein the scoring 
matrix defining a state of dissimilarity is a function of the scoring matrix defining a state 
of similarity. 

112. (Previously Presented) The method of claim 108, wherein the scoring 
matrix defining the state of dissimilarity is a function of the arithmetic inverse of the 
scoring matrix defining a state of similarity. 

113. (Canceled) 

114. (Previously Presented) The method of claim 107, wherein the set of known 
sequences consists of two sequences. 

115. (Previously Presented) The method of claim 107, wherein the set of known 
sequences comprises at least three sequences. 

116. (Previously Presented) The method of claim 107, wherein the set of known 
sequences comprises amino acid sequences. 

117. (Previously Presented) The method of claim 107, wherein the set of known 
sequences comprises nucleic acid sequences. 

118. (Previously Presented) The method of claim 107, wherein one or more
nodes represent an insertion at a first position in the set of known sequences. 

119. (Previously Presented) The method of claim 107, wherein one or more 
nodes represent a deletion at defined positions in the set of known sequences. 

120. (Previously Presented) The method of claim 107, wherein each node
represents a distribution of monomers at defined positions in the set of known 
sequences. 

121. (Previously Presented) The method of claim 120, wherein the distribution 
of monomers at a first node is different from the distribution of monomers at a second 
node. 

122. (Previously Presented) The method of claim 121, wherein the distribution of
monomers is a function of a scoring matrix that relates the distribution of monomers at a 
first node and a scoring matrix that relates the distribution of monomers at a second 
node. 

123. (Previously Presented) The method of claim 122, wherein the scoring 
matrix is a function of independent probabilities of a monomer occurrence. 

124. (Currently Amended) The method of claim 123, wherein the distribution 
P(a,b) of monomers a and b, a scoring matrix S(a,b), and independent probabilities of 
monomers, Q(a) and Q(b) are related such that S(a,b) = log(P(a,b) / (Q(a) • Q(b))).

125. (Previously Presented) The method of claim 107, wherein the model 
comprises a first module which characterizes a match state between the set of known 
sequences in a first region and a second module which characterizes a match state 
between the set of known sequences in a second region; wherein the match states of 
the first and second modules are different. 

126. (Previously Presented) The method of claim 125, wherein the model further 
comprises a third module that characterizes a match state between the set of known 
sequences in a third region. 

127. (Previously Presented) The method of claim 126, wherein the third module
is positioned between the first and second module with respect to the order of the set of
known sequences.

128. (Previously Presented) The method of claim 127, wherein the third module 
indicates similarity between a third region of each set of known sequence, and a 
sequence profile characterized by altered scoring matrices. 

129. (Previously Presented) The method of claim 128, wherein the sequence
profile comprises a profile of a modification site. 

130. (Previously Presented) The method of claim 129, wherein the modification 
site is a processing site. 

131. (Previously Presented) The method of claim 130, wherein the processing 
site indicates a preference for at least one basic residue. 

132. (Previously Presented) The method of claim 130, wherein the processing 
site indicates a preference for at least two basic residues. 

133. (Previously Presented) The method of claim 130, wherein the processing
site comprises a convertase processing site. 

134. (Previously Presented) The method of claim 130, wherein the processing 
site comprises a secretase processing site. 

135. (Previously Presented) The method of claim 107, wherein the biopolymer 
sequences comprise sequences from different species. 

136. (Previously Presented) The method of claim 134, wherein the different
species comprise mammalian species. 

137. (Previously Presented) The method of claim 107, wherein the set of known
sequences comprise genomic nucleic acid sequences. 

138. (Previously Presented) The method of claim 107, wherein the set of known 
sequences comprises non-coding regions. 

139. (Previously Presented) The method of claim 107, wherein the set of known
sequences comprises regulatory regions. 

140. (Previously Presented) The method of claim 107, wherein the set of known 
sequences comprises transcriptional regulatory regions. 

141. (Canceled) 

142. (Currently Amended) A computer-readable medium having stored thereon
a plurality of instructions, the plurality of instructions including instructions which, when
executed by a processor, cause the processor to perform the steps of a method for
identifying similar biopolymers [[biopolymer sequences characterized by a topological
pattern of match states]], comprising [[of]]:

constructing a statistical model comprising a hidden Markov Model of a set of
known sequences that correspond to defined regions of biopolymer sequences in a set
of biopolymers to provide [[characterized by]] a characteristic topological pattern of
match states between the biopolymer sequences, each match state characterized by a
scoring matrix, wherein the scoring matrix for a first match state defines a state of
similarity for a conserved region of the biopolymer sequences and the scoring matrix for
a second match state defines a state of dissimilarity for a divergent region of the
biopolymer sequences, [[;]] the model comprising one or more modules of nodes; [[,]]

comparing the set of biopolymer sequences to the statistical model by evaluating
[[the topological pattern of match states of the biopolymer sequences to the
topological pattern of match states of the set of known sequences and comparing]] the
scoring matrices of the match states to provide a score [[determine the state of
similarity or the state of dissimilarity of the biopolymer sequences and the set of known
sequences]]; and

outputting the score indicative of a likelihood that the set of biopolymer
sequences is represented by the model and thereby similar biopolymers [[identifying the
biopolymer sequences based on the state of similarity or the state of dissimilarity with
the set of known sequences]].